13 research outputs found

    Annotating Nontargeted LC-HRMS/MS Data with Two Complementary Tandem Mass Spectral Libraries

    Tandem mass spectral databases are indispensable for fast and reliable compound identification in nontargeted analysis with liquid chromatography–high resolution tandem mass spectrometry (LC-HRMS/MS), which is applied to a wide range of scientific fields. While many articles now review and compare spectral libraries, in this manuscript we investigate two high-quality and specialized collections from our respective institutes, recorded on different instruments (quadrupole time-of-flight or QqTOF vs. Orbitrap). The optimal range of collision energies for spectral comparison was evaluated using 233 overlapping compounds between the two libraries, revealing that spectra in the range of CE 20–50 eV on the QqTOF and 30–60 nominal collision energy units on the Orbitrap provided optimal matching results for these libraries. Applications to complex samples from the respective institutes revealed that the libraries, combined with a simple data mining approach to retrieve all spectra with precursor and fragment information, could confirm many validated target identifications and yield several new Level 2a (spectral match) identifications. While the results presented are not surprising in many ways, this article adds new results to the debate on the comparability of Orbitrap and QqTOF data and the application of spectral libraries to yield rapid and high-confidence tentative identifications in complex human and environmental samples.
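
    As a rough illustration of what "spectral comparison" between two library entries can look like, the sketch below computes a greedy cosine score between two centroided MS/MS spectra. This is a generic textbook-style score and is not claimed to be the scoring used in the article; the function name, the `(mz, intensity)` tuple representation, and the 0.01 Da tolerance are all assumptions made for the example.

```python
import math


def cosine_similarity(spec_a, spec_b, tol=0.01):
    """Greedy cosine score between two centroided spectra.

    Each spectrum is a list of (mz, intensity) tuples. `tol` is an
    absolute m/z tolerance in Da (an assumed, illustrative default).
    """
    # Collect every pair of peaks whose m/z values agree within `tol`.
    pairs = [
        (int_a * int_b, i, j)
        for i, (mz_a, int_a) in enumerate(spec_a)
        for j, (mz_b, int_b) in enumerate(spec_b)
        if abs(mz_a - mz_b) <= tol
    ]
    # Assign pairs greedily, largest intensity product first,
    # using each peak at most once.
    pairs.sort(reverse=True)
    used_a, used_b, dot = set(), set(), 0.0
    for product, i, j in pairs:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product
    norm_a = math.sqrt(sum(inten * inten for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten * inten for _, inten in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

    Under these assumptions, two spectra with identical peaks score close to 1 and spectra with no shared fragments score 0. Comparing spectra recorded at different collision energies on two instruments would amount to calling such a function over each pair of energies.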

    The metaRbolomics Toolbox in Bioconductor and beyond

    Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics-specific packages primarily available on CRAN, Bioconductor and GitHub.

    MSNovelist: de novo structure generation from mass spectra

    Current methods for structure elucidation of small molecules rely on finding similarity with spectra of known compounds, but do not predict structures de novo for unknown compound classes. We present MSNovelist, which combines fingerprint prediction with an encoder-decoder neural network to generate structures de novo solely from tandem mass spectrometry (MS2) spectra. In an evaluation with 3,863 MS2 spectra from the Global Natural Product Social Molecular Networking site, MSNovelist predicted 25% of structures correctly on first rank, retrieved 45% of structures overall and reproduced 61% of correct database annotations, without having ever seen the structure in the training phase. Similarly, for the CASMI 2016 challenge, MSNovelist correctly predicted 26% and retrieved 57% of structures, recovering 64% of correct database annotations. Finally, we illustrate the application of MSNovelist in a bryophyte MS2 dataset, in which de novo structure prediction substantially outscored the best database candidate for seven spectra. MSNovelist is ideally suited to complement library-based annotation in the case of poorly represented analyte classes and novel compounds. ISSN: 1548-7105; ISSN: 1548-709

    Inactivation and Site-specific Oxidation of Aquatic Extracellular Bacterial Leucine Aminopeptidase by Singlet Oxygen

    Extracellular enzymes are master recyclers of organic matter, and to predict their functional lifetime, we need to understand their environmental transformation processes. In surface waters, direct and indirect photochemical transformation is a known driver of inactivation. We investigated molecular changes that occur along with inactivation in aminopeptidase, an abundant class of extracellular enzymes. We studied the inactivation kinetics and localized oxidation caused by singlet oxygen, ¹O₂, a major photochemically derived oxidant toward amino acids. Aminopeptidase showed second-order inactivation rate constants with ¹O₂ comparable to those of free amino acids. We then visualized site-specific oxidation kinetics within the three-dimensional protein and demonstrated that the fastest oxidation occurred around the active site and at other reactive amino acids. However, second-order oxidation rate constants did not correlate strictly with the ¹O₂-accessible surface areas of those amino acids. We inspected site-specific processes by a comprehensive suspect screening for 723,288 possible transformation products. We concluded that histidine involved in zinc coordination at the active site reacted more slowly than expected from its accessibility, and we differentiated between two competing reaction pathways of ¹O₂ with tryptophan residues. This systematic analysis can be directly applied to other proteins and transformation reactions. ISSN: 0013-936X; ISSN: 1520-585

    MSNovelist: De novo structure generation from mass spectra

    Structural elucidation of small molecules de novo from mass spectra is a longstanding, yet unsolved problem. Current methods rely on finding some similarity with spectra of known compounds deposited in spectral libraries, but do not solve the problem of predicting structures for novel or poorly represented compound classes. We present MSNovelist, which combines fingerprint prediction with an encoder-decoder neural network to generate structures de novo from fragment spectra. In an evaluation, MSNovelist correctly reproduced 61% of database annotations for a GNPS reference dataset. In a bryophyte MS2 dataset, our de novo structure prediction substantially outscored the best database candidate for seven features, and a potential novel natural product with a flavonoid core was identified. MSNovelist allows predicting structures solely from MS2 data, and is therefore ideally suited to complement library-based annotation in the case of poorly represented analyte classes and novel compounds.

    Microvolume trace environmental analysis using peak-focusing online solid-phase extraction–nano-liquid chromatography–high-resolution mass spectrometry

    Online solid-phase extraction was combined with nano-liquid chromatography coupled to high-resolution mass spectrometry (HRMS) for the analysis of micropollutants in environmental samples from small volumes. The method was validated in surface water, Microcystis aeruginosa cell lysate, and spent Microcystis growth medium. For 41 analytes, quantification limits of 0.1–28 ng/L (surface water) and 0.1–32 ng/L (growth medium) were obtained from only 88 μL of sample. In cell lysate, quantification limits ranged from 0.1 to 143 ng/L, or 0.33 to 476 ng/g dry weight, from a sample of 88 μL, or 26 μg dry weight, respectively. The method matches the sensitivity of established online and offline solid-phase extraction–liquid chromatography–mass spectrometry methods but requires only a fraction of the sample used by those techniques, and is among the first applications of nano-LC-MS for environmental analysis. The method was applied to the determination of bioconcentration in Microcystis aeruginosa in a laboratory experiment, and the benefit of coupling to HRMS was demonstrated in a transformation product screening. ISSN: 1618-2650; ISSN: 1618-264

    Strategies to Characterize Polar Organic Contamination in Wastewater: Exploring the Capability of High Resolution Mass Spectrometry

    Wastewater effluents contain a multitude of organic contaminants and transformation products, which cannot be captured by target analysis alone. High accuracy, high resolution mass spectrometric data were explored with novel untargeted data processing approaches (enviMass, nontarget, and RMassBank) to complement an extensive target analysis in initial “all in one” measurements. On average, 1.2% of the detected peaks from 10 Swiss wastewater treatment plant samples were assigned to target compounds, with 376 reference standards available. Corrosion inhibitors, artificial sweeteners, and pharmaceuticals exhibited the highest concentrations. After blank and noise subtraction, 70% of the peaks remained and were grouped into components; 20% of these components had adduct and/or isotope information available. An intensity-based prioritization revealed that only 4 targets were among the top 30 most intense peaks (negative mode), while 15 of these peaks contained sulfur. Of the 26 nontarget peaks, 7 were tentatively identified via suspect screening for sulfur-containing surfactants and one peak was identified and confirmed as 1,3-benzothiazole-2-sulfonate, an oxidation product of a vulcanization accelerator. High accuracy, high resolution data combined with tailor-made nontarget processing methods (all available online) provided vital information for the identification of a wider range of heteroatom-containing compounds in the environment.

    A Modular and Expandable Ecosystem for Metabolomics Data Annotation in R

    Liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomics experiments have become increasingly popular because of the wide range of metabolites that can be analyzed and the possibility to measure novel compounds. LC-MS instrumentation and analysis conditions can differ substantially among laboratories and experiments, thus resulting in non-standardized datasets demanding customized annotation workflows. We present an ecosystem of R packages, centered around the MetaboCoreUtils, MetaboAnnotation and CompoundDb packages, that together provide a modular infrastructure for the annotation of untargeted metabolomics data. Initial annotation can be performed based on MS1 properties such as m/z and retention times, followed by an MS2-based annotation in which experimental fragment spectra are compared against a reference library. Such reference databases can be created and managed with the CompoundDb package. The ecosystem supports data from a variety of formats, including, but not limited to, MSP, MGF, mzML, mzXML, netCDF as well as MassBank text files and SQL databases. Through its highly customizable functionality, the presented infrastructure allows users to build reproducible annotation workflows tailored for and adapted to most untargeted LC-MS-based datasets. All core functionality, which supports base R data types, is exported, also facilitating its re-use in other R packages. Finally, all packages are thoroughly unit-tested and documented and are available on GitHub and through Bioconductor.
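
    The MS1 step mentioned in this abstract, matching measured m/z values against reference values, can be sketched in a few lines. The example below is written in Python purely for illustration and does not reproduce the API of MetaboAnnotation or any other package named above; the function names, the 10 ppm default, and the example m/z values are assumptions.

```python
def ppm_error(measured_mz, reference_mz):
    """Signed mass error of a measurement in parts per million."""
    return (measured_mz - reference_mz) / reference_mz * 1e6


def match_mz(features, references, ppm=10.0):
    """Return (feature_mz, reference_name, ppm_error) for every match.

    `features` is a list of measured m/z values; `references` maps a
    hypothetical compound/adduct label to its reference m/z. `ppm` is an
    assumed tolerance; suitable values depend on the instrument.
    """
    hits = []
    for mz in features:
        for name, ref_mz in references.items():
            err = ppm_error(mz, ref_mz)
            if abs(err) <= ppm:
                hits.append((mz, name, err))
    return hits


# Illustrative values only: one feature close to the reference, one far off.
references = {"caffeine [M+H]+": 195.0877}
hits = match_mz([195.0880, 300.0000], references)
```

    A real workflow would typically add a retention-time window and consider several adducts per compound before moving on to the MS2 comparison described in the abstract.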